Avoiding Communication through a Multilevel LU Factorization

Authors

Simplice Donfack

Laura Grigori

Amal Khabou

Abstract

As massively parallel computers evolve towards deeper levels of parallelism and memory hierarchy, and as the ratio of the time required to transfer data (through the memory hierarchy or between compute units) to the time required to perform floating-point operations grows exponentially, algorithms face two challenges. They must not only exploit multiple levels of parallelism, but also reduce communication between the compute units at each level of the hierarchy of parallelism and between the different levels of the memory hierarchy. In this paper we present an algorithm for performing the LU factorization of dense matrices that is suitable for computer systems with two levels of parallelism. This algorithm minimizes both the volume of communication and the number of messages transferred at every level of the two-level hierarchy of parallelism. We present its implementation for a cluster of multicore processors based on MPI and Pthreads. We show that this implementation leads to better performance than routines implementing the LU factorization in well-known numerical libraries. For matrices that are tall and skinny, that is, matrices with many more rows than columns, our algorithm outperforms the corresponding algorithm from ScaLAPACK by a factor of 4.5 on a cluster of 32 nodes, each node having two quad-core Intel Xeon EMT64 processors.
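
For readers unfamiliar with the operation being discussed, the following is a minimal sequential sketch of an LU factorization with partial (row) pivoting applied to a tall-and-skinny matrix. It is only an illustration of the conventional baseline computation, not the multilevel, communication-avoiding algorithm the abstract describes, and it contains no MPI or Pthreads parallelism. The function names `lu_partial_pivoting` and `max_residual` and the sample matrix are assumptions made for this example.

```python
def lu_partial_pivoting(a):
    """Factor an m x n matrix (m >= n) so that P*A = L*U.

    Returns (perm, l, u): perm[i] is the index of the original row that
    ends up in position i, l is m x n unit lower trapezoidal, and u is
    n x n upper triangular. Illustrative baseline only.
    """
    m, n = len(a), len(a[0])
    if m < n:
        raise ValueError("expected a tall matrix with m >= n")
    w = [row[:] for row in a]          # working copy, overwritten in place
    perm = list(range(m))
    for k in range(n):
        # Choose the entry of largest magnitude in column k, rows k..m-1.
        p = max(range(k, m), key=lambda i: abs(w[i][k]))
        if w[p][k] == 0.0:
            raise ZeroDivisionError("matrix is rank deficient")
        w[k], w[p] = w[p], w[k]
        perm[k], perm[p] = perm[p], perm[k]
        for i in range(k + 1, m):
            w[i][k] /= w[k][k]                  # multiplier, stored in L's slot
            for j in range(k + 1, n):
                w[i][j] -= w[i][k] * w[k][j]    # update the trailing block
    l = [[1.0 if i == j else (w[i][j] if j < i else 0.0) for j in range(n)]
         for i in range(m)]
    u = [[w[i][j] if j >= i else 0.0 for j in range(n)] for i in range(n)]
    return perm, l, u


def max_residual(a, perm, l, u):
    """Largest absolute entry of P*A - L*U, used to check the factorization."""
    n = len(u)
    worst = 0.0
    for i, src in enumerate(perm):
        for j in range(n):
            lu_ij = sum(l[i][k] * u[k][j] for k in range(n))
            worst = max(worst, abs(a[src][j] - lu_ij))
    return worst


# Hypothetical 4 x 2 tall-and-skinny example (more rows than columns).
a = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 9.0]]
perm, l, u = lu_partial_pivoting(a)
print("row order:", perm)
print("residual:", max_residual(a, perm, l, u))
```

In this sketch every column requires a search over all remaining rows to find its pivot. In a distributed setting, where the rows of a tall-and-skinny matrix are spread over many processors, that per-column search is one source of the message traffic that communication-avoiding factorizations aim to reduce.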

Similar resources

LU factorization with panel rank revealing pivoting and its communication avoiding version

We present the LU decomposition with panel rank revealing pivoting (LU PRRP), an LU factorization algorithm based on strong rank revealing QR panel factorization. LU PRRP is more stable than Gaussian elimination with partial pivoting (GEPP), with a theoretical upper bound of the growth factor of (1 + τb)^(n/b), where b is the size of the panel used during the block factorization, τ is a parameter...

Performance Predictions of Multilevel Communication Optimal LU and QR Factorizations on Hierarchical Platforms

In this paper we study the performance of two classical dense linear algebra algorithms, the LU and the QR factorizations, on multilevel hierarchical platforms. We note that we focus on multilevel QR factorization, and give a brief description of the multilevel LU factorization. We first introduce a performance model called Hierarchical Cluster Platform (Hcp), encapsulating the characteristics ...

Calculs pour les matrices denses : coût de communication et stabilité numérique. (Dense matrix computations : communication cost and numerical stability)

This dissertation focuses on a widely used linear algebra kernel for solving linear systems, namely the LU decomposition. Usually, to perform such a computation one uses Gaussian elimination with partial pivoting (GEPP). The backward stability of GEPP depends on a quantity referred to as the growth factor; it is known that, in general, GEPP leads to modest element growth in practice. H...

THE USE OF SEMI INHERITED LU FACTORIZATION OF MATRICES IN INTERPOLATION OF DATA

Polynomial interpolation in the one-dimensional space R is an important method for approximating functions. The Lagrange and Newton methods are two well-known types of interpolation. In this work, we describe semi inherited interpolation for approximating the values of a function. In this case, the interpolation matrix has a semi inherited LU factorization.

Parallel Multilevel Block ILU Preconditioning Techniques for Large Sparse Linear Systems

We present a class of parallel preconditioning strategies built on a multilevel block incomplete LU (ILU) factorization technique to solve large sparse linear systems on distributed memory parallel computers. The preconditioners are constructed by using the concept of block independent sets. Two algorithms for constructing block independent sets of a distributed sparse matrix are proposed. We c...

Publication date: 2012
